88 research outputs found

    Audio Content-Based Music Retrieval

    The rapidly growing corpus of digital audio material requires novel retrieval strategies for exploring large music collections. Traditional retrieval strategies rely on metadata that describe the actual audio content in words. When such textual descriptions are not available, one requires content-based retrieval strategies that utilize only the raw audio material. In this contribution, we discuss content-based retrieval strategies that follow the query-by-example paradigm: given an audio query, the task is to retrieve all documents that are somehow similar or related to the query from a music collection. Such strategies can be loosely classified according to their "specificity", which refers to the degree of similarity between the query and the database documents. Here, high specificity refers to a strict notion of similarity, whereas low specificity refers to a rather vague one. Furthermore, we introduce a second classification principle based on "granularity", where one distinguishes between fragment-level and document-level retrieval. Using a classification scheme based on specificity and granularity, we identify various classes of retrieval scenarios, which comprise "audio identification", "audio matching", and "version identification". For these three important classes, we give an overview of representative state-of-the-art approaches, which also illustrate the sometimes subtle but crucial differences between the retrieval scenarios. Finally, we give an outlook on a user-oriented retrieval system, which combines the various retrieval strategies in a unified framework.

    Towards Automated Processing of Folk Song Recordings

    Folk music is closely related to the musical culture of a specific nation or region. Even though folk songs have been passed down mainly by oral tradition, most musicologists study the relation between folk songs on the basis of symbolic music descriptions, which are obtained by transcribing recorded tunes into a score-like representation. Because audio recordings are complex to work with, the original recorded tunes are often no longer used in the actual folk song research once transcriptions are available, even though they may still contain valuable information. In this paper, we present various techniques for making audio recordings more easily accessible for music researchers. In particular, we show how one can use synchronization techniques to automatically segment and annotate the recorded songs. The processed audio recordings can then be made accessible along with a symbolic transcript by means of suitable visualization, searching, and navigation interfaces that help folk song researchers conduct large-scale investigations comprising the audio material.

    Signal processing methods for beat tracking, music segmentation, and audio retrieval

    The goal of music information retrieval (MIR) is to develop novel strategies and techniques for organizing, exploring, accessing, and understanding music data in an efficient manner. The conversion of waveform-based audio data into semantically meaningful feature representations by the use of digital signal processing techniques is at the center of MIR and constitutes a difficult field of research because of the complexity and diversity of music signals. In this thesis, we introduce novel signal processing methods that allow for extracting musically meaningful information from audio signals. As our main strategy, we exploit musical knowledge about the signals' properties to derive feature representations that show a significant degree of robustness against musical variations but still exhibit a high musical expressiveness. We apply this general strategy to three different areas of MIR: Firstly, we introduce novel techniques for extracting tempo and beat information, where we particularly consider challenging music with changing tempo and soft note onsets. Secondly, we present novel algorithms for the automated segmentation and analysis of folk song field recordings, where one has to cope with significant fluctuations in intonation and tempo as well as recording artifacts. Thirdly, we explore a cross-version approach to content-based music retrieval based on the query-by-example paradigm. In all three areas, we focus on application scenarios where strong musical variations make the extraction of musically meaningful information a challenging task.
    The goal of automated music processing is to develop new strategies and techniques for efficiently organizing large music collections. One focus lies in applying digital signal processing methods to convert audio signals into musically meaningful feature representations. Major challenges in this task arise from the complexity and multifaceted nature of music signals. In this thesis, novel methods are presented that make it possible to extract musically interpretable information from music signals. A fundamental strategy is the consistent exploitation of prior musical knowledge in order to derive feature representations that are highly robust to musical variations while retaining high musical expressiveness. We apply this principle to three different tasks: First, we present novel approaches for extracting tempo and beat information from audio signals, which are applied in particular to challenging scenarios with changing tempo and soft note onsets. Second, we contribute novel algorithms for the segmentation and analysis of field recordings of folk songs in the presence of large fluctuations in intonation. Third, we develop efficient methods for content-based search in large data collections with the aim of detecting different interpretations of a piece of music. In all scenarios considered, we pay particular attention to cases in which considerable musical variations make the extraction of musically meaningful information a major challenge.

    A coherent optical link through the turbulent atmosphere

    We describe the realization of a 5 km free-space coherent optical link through the turbulent atmosphere between a telescope and a ground target. We present the phase noise of the link, limited mainly by atmospheric turbulence and mechanical vibrations of the telescope and the target. We discuss the implications of our results for applications, with particular emphasis on optical Doppler ranging to satellites and long-distance frequency transfer.
    Comment: version 2, modified following comments from colleagues and reviewer

    Pressure-induced and Composition-induced Structural Quantum Phase Transition in the Cubic Superconductor (Sr/Ca)_3Ir_4Sn_{13}

    We show that the quasi-skutterudite superconductor Sr_3Ir_4Sn_{13} undergoes a structural transition from a simple cubic parent structure, the I-phase, to a superlattice variant, the I'-phase, which has a lattice parameter twice that of the high-temperature phase. We argue that the superlattice distortion is associated with a charge density wave transition of the conduction electron system and demonstrate that the superlattice transition temperature T* can be suppressed to zero by combining chemical and physical pressure. This enables the first comprehensive investigation of a superlattice quantum phase transition and its interplay with superconductivity in a cubic charge density wave system.
    Comment: 4 figures, 5 pages (excluding supplementary material). To be published in Phys. Rev. Let

    Towards polarization-based excitation tailoring for extended Raman spectroscopy

    Undoubtedly, Raman spectroscopy is one of the most elaborate spectroscopy tools in materials science, chemistry, medicine and optics. However, when it comes to the analysis of nanostructured specimens or individual sub-wavelength-sized systems, the access to Raman spectra resulting from different excitation schemes is usually very limited. For instance, the excitation with an electric field component oriented perpendicularly to the substrate plane is a difficult task. Conventionally, this can only be achieved by mechanically tilting the sample or by sophisticated sample preparation. Here, we propose a novel experimental method based on the utilization of polarization-tailored light for Raman spectroscopy of individual nanostructures. As a proof of principle, we create three-dimensional electromagnetic field distributions at the nanoscale using tightly focused cylindrical vector beams impinging normally onto the specimen, hence keeping the traditional beam path of commercial Raman systems. In order to demonstrate the convenience of this excitation scheme, we use a sub-wavelength-diameter gallium-nitride nanostructure as a test platform and show experimentally that its Raman spectra depend sensitively on its location relative to the focal vector field. The observed Raman spectra can be attributed to the interaction with transverse and pure longitudinal electric field components. This novel technique may pave the way towards a characterization of Raman-active nanosystems, granting direct access to growth-related parameters such as strain or defects in the material by using the full information of all Raman modes.

    Funktionaler Analphabetismus im Erwachsenenalter: eine Definition

    This article presents a current definition of functional illiteracy. The aim is to establish a definitional basis that is usable for both research and practice. This definition can be understood as a core statement, but it is also open to extensions that take specific research questions into account. To this end, a new definition of functional illiteracy is formulated, elaborated, and discussed on the basis of generally accepted earlier working definitions. Particular attention is paid to operationalizability. The core of the definition reads: functional illiteracy exists when the written-language competencies of adults are lower than those that are minimally required, and taken for granted, in order to meet the respective societal demands. These written-language competencies are considered necessary to open up social participation and the realization of individual opportunities for self-realization.

    Extraction of anthropometric measures from 3D-meshes for the individualization of head-related transfer functions

    Anthropometric measures are used for individualizing head-related transfer functions (HRTFs), for example by selecting best-match HRTFs from a large library or by manipulating HRTFs with respect to anthropometrics. Within this process, an accurate extraction of anthropometric measures is crucial, as small changes may already influence the individualization. Anthropometrics can be measured in many different ways, e.g., from pictures or with anthropometers. However, these approaches tend to be inaccurate. Therefore, we propose to use Kinect for generating individual 3D head-and-shoulder meshes from which anthropometrics are automatically extracted. This is achieved by identifying and measuring distances between characteristic points on the outline of each mesh and was found to yield accurate and reliable estimates of the corresponding features. In our experiment, a large set of anthropometric measures was automatically extracted for 61 subjects and evaluated by cross-validation against the manually extracted counterparts.

    A Cross-Evaluated Database of Measured and Simulated HRTFs Including 3D Head Meshes, Anthropometric Features, and Headphone Impulse Responses

    The individualization of head-related transfer functions (HRTFs) can make an important contribution to improving the quality of binaural technology applications. One approach to individualization is to exploit the relationship between the shape of HRTFs and the anthropometric features of the ears, head, and torso of the corresponding listeners. To identify statistically significant relationships between the two sets of variables, a relatively large database is required. For this purpose, full-spherical HRTFs of 96 subjects were acoustically measured and numerically simulated. A detailed cross-evaluation showed good agreement with previous data, between repeated measurements, and between measured and simulated data. In addition to 96 HRTFs, the database includes high-resolution head meshes, a list of 25 anthropometric features per subject, and headphone transfer functions for two headphone models.

    Perceptually Motivated Analysis of Numerically Simulated Head-Related Transfer Functions Generated By Various 3D Surface Scanning Systems

    Numerical simulations offer a feasible alternative to the direct acoustic measurement of individual head-related transfer functions (HRTFs). For the acquisition of high-quality 3D surface scans, as required for these simulations, several approaches exist. In this paper, we systematically analyze the variations between different approaches and evaluate the influence of the accuracy of 3D scans on the resulting simulated HRTFs. To assess this effect, HRTFs were numerically simulated based on 3D scans of the head and pinna of the FABIAN dummy head generated with 6 different methods. These HRTFs were analyzed in terms of interaural time difference, interaural level difference, energetic error in auditory filters, and their modeled localization performance. The results show that a geometric precision of about 1 mm is needed to maintain accurate localization cues, while a precision of about 4 mm is sufficient to maintain the overall spectral shape.